Smart Paper Notes

Human Mobility Modeling During the COVID-19 Pandemic via Deep Graph Diffusion Infomax

Yang Liu, Yu Rong, Zhuoning Guo, Nuo Chen, Tingyang Xu, Fugee Tsung, Jia Li

Categories: Machine Learning

2022-12-12

Non-Pharmaceutical Interventions (NPIs), such as social gathering restrictions, have been shown to slow the transmission of COVID-19 by reducing contact between people. To support policy-makers, multiple studies have first modeled human mobility via macro indicators (e.g., average daily travel distance) and then studied the effectiveness of NPIs. In this work, we focus on mobility modeling and, from a micro perspective, aim to predict the locations that will be visited by COVID-19 cases. Since NPIs generally cause economic and societal loss, such a micro-perspective prediction benefits governments when they design and evaluate NPIs. However, in real-world situations, strict data privacy regulations result in severe data sparsity problems (i.e., limited case and location information). To address these challenges, we formulate micro-perspective mobility modeling as computing the relevance score between a diffusion and a location, conditional on a geometric graph. We propose a model named Deep Graph Diffusion Infomax (DGDI), which jointly models variables including a geometric graph, a set of diffusions, and a set of locations. To facilitate research on COVID-19 prediction, we present two benchmarks that contain geometric graphs and location histories of COVID-19 cases. Extensive experiments on the two benchmarks show that DGDI significantly outperforms other competing methods.

Translated by Google Translate

Similarity-aware Positive Instance Sampling for Graph Contrastive Pre-training

Xueyi Liu, Yu Rong, Tingyang Xu, Fuchun Sun, Wenbing Huang, Junzhou Huang

Categories: Machine Learning | Artificial Intelligence

2022-06-23

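Graph contrastive learning has proven to be an effective task for pre-training graph neural networks (GNNs). However, one key issue may seriously impede the representative power of existing works: positive instances created by current methods often miss crucial information of graphs or even yield illegal instances (such as non-chemically-aware graphs in molecular generation). To remedy this issue, we propose to select positive graph instances directly from existing graphs in the training set, which ultimately maintains both legality and similarity to the target graphs. Our selection is based on certain domain-specific pairwise similarity measurements as well as on sampling from a hierarchical graph that encodes similarity relations among graphs. In addition, we develop an adaptive node-level pre-training method that dynamically masks nodes so that they are distributed evenly in the graph. We conduct extensive experiments on 13 graph classification and node classification benchmark datasets from various domains. The results demonstrate that GNN models pre-trained with our strategies can outperform models trained from scratch as well as variants obtained by existing methods.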
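
The idea of picking positive instances from the training set by pairwise similarity can be illustrated with a small sketch. This is not the authors' procedure: the abstract describes sampling from a hierarchical graph of similarity relations, whereas this example simply takes the top-k most similar candidates. The Tanimoto measure over fingerprint bit sets and the names `tanimoto` and `select_positives` are assumptions chosen for illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_positives(target: frozenset, candidates: list, k: int) -> list:
    """Return the indices of the k candidates most similar to the target.

    Candidates are existing training graphs (here represented only by their
    fingerprints), so every selected positive is a legal instance by construction.
    """
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: tanimoto(target, candidates[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical fingerprints: each graph is summarized as a set of integer bits.
target = frozenset({1, 2, 3, 4})
candidates = [
    frozenset({1, 2, 3}),        # similarity 3/4
    frozenset({5, 6}),           # similarity 0
    frozenset({1, 2, 3, 4, 5}),  # similarity 4/5
    frozenset({4}),              # similarity 1/4
]
positives = select_positives(target, candidates, k=2)
```

Replacing the top-k rule with similarity-weighted sampling would be closer to the description in the abstract, at the cost of a non-deterministic result.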


Hypergraph Convolutional Networks via Equivalency between Hypergraphs and Undirected Graphs

Jiying Zhang, Fuyang Li, Xi Xiao, Tingyang Xu, Yu Rong, Junzhou Huang, Yatao Bian

Categories: Machine Learning

2022-03-31

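As a powerful tool for modeling complex relationships, hypergraphs are gaining popularity in the graph learning community. However, commonly used frameworks in deep hypergraph learning focus on hypergraphs with edge-independent vertex weights (EIVWs), without considering hypergraphs with edge-dependent vertex weights (EDVWs), which have more modeling power. To compensate for this, we present General Hypergraph Spectral Convolution (GHSC), a general learning framework that not only handles both EDVW and EIVW hypergraphs but, more importantly, makes it theoretically possible to use existing powerful graph convolutional neural networks (GCNNs) explicitly, which largely eases the design of hypergraph neural networks. In this framework, the graph Laplacian of a given undirected GCNN is replaced with a unified hypergraph Laplacian that incorporates vertex weight information from a random walk perspective, by equating our defined generalized hypergraphs with simple undirected graphs. Extensive experiments from various domains, including social network analysis, visual object classification, and protein learning, demonstrate the state-of-the-art performance of the proposed framework.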


Neighbour Interaction based Click-Through Rate Prediction via Graph-masked Transformer

Erxue Min, Yu Rong, Tingyang Xu, Yatao Bian, Peilin Zhao, Junzhou Huang, Da Luo, Kangyi Lin, Sophia Ananiadou

Categories: Artificial Intelligence | Machine Learning

2022-01-25

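Click-through rate (CTR) prediction, which aims to estimate the probability that a user will click an item, is an essential component of online advertising. Existing methods mainly attempt to mine user interests from users' historical behaviours, which contain the items the users have directly interacted with. Although these methods have made great progress, they are often limited by the recommender system's direct exposure and inactive interactions, and thus fail to mine all potential user interests. To tackle these problems, we propose Neighbour-Interaction based CTR prediction (NI-CTR), which considers this task under a Heterogeneous Information Network (HIN) setting. In short, NI-CTR involves the local neighbourhood of the target user-item pair in the HIN to predict their linkage. To guide the representation learning of the local neighbourhood, we consider different kinds of interactions among the local neighbourhood nodes from both explicit and implicit perspectives, and propose a novel Graph-Masked Transformer (GMT) that effectively incorporates these kinds of interactions to produce highly representative embeddings for the target user-item pair. Moreover, to improve model robustness against neighbour sampling, we enforce a consistency regularization loss over the neighbourhood embedding. We conduct extensive experiments on two real-world datasets with millions of instances, and the results show that the proposed method significantly outperforms state-of-the-art CTR models. Comprehensive ablation studies verify the effectiveness of every component of our model. Furthermore, we have deployed this framework on the WeChat Official Account Platform, which has billions of users. Online A/B tests demonstrate an average CTR improvement of 21.9 against all online baselines.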
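
The notion of attention restricted by a graph mask can be sketched in a few lines of NumPy. This is an illustrative simplification, not the GMT architecture: it uses a single head, no learned projections, and one mask derived from adjacency, while the abstract mentions several kinds of explicit and implicit interactions. The function name `graph_masked_attention` and the toy neighbourhood are assumptions.

```python
import numpy as np


def graph_masked_attention(x: np.ndarray, mask: np.ndarray):
    """Single-head self-attention in which node i attends to node j only if mask[i, j].

    x:    (n, d) node embeddings.
    mask: (n, n) boolean matrix; assumed to be True on the diagonal so that
          every row has at least one allowed position.
    Returns the attended embeddings and the attention weights.
    """
    d = x.shape[1]
    scores = (x @ x.T) / np.sqrt(d)                       # scaled dot-product scores
    scores = np.where(mask, scores, -np.inf)              # forbid masked-out pairs
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)                              # exp(-inf) evaluates to 0
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights


# Hypothetical 4-node neighbourhood: a user, an item, and two sampled neighbours.
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]])
adjacency = np.array(
    [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0]],
    dtype=bool,
)
mask = adjacency | np.eye(4, dtype=bool)  # allow self-attention
out, weights = graph_masked_attention(x, mask)
```

Each row of `weights` sums to one and is exactly zero wherever the mask forbids attention, so information flows only along the allowed graph relations.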


Towards the Explanation of Graph Neural Networks in Digital Pathology with Information Flows

Junchi Yu, Tingyang Xu, Ran He

Categories: Machine Learning | Artificial Intelligence

2021-12-18

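As Graph Neural Networks (GNNs) are widely adopted in digital pathology, there is increasing attention to developing explanation models (explainers) for GNNs to improve the transparency of clinical decisions. Existing explainers discover an explanatory subgraph relevant to the prediction. However, such a subgraph is insufficient to reveal all the critical biological substructures for the prediction, because the prediction may remain unchanged after that subgraph is removed. Hence, an explanatory subgraph should be not only necessary for the prediction but also sufficient to uncover the most predictive regions for the explanation. Such an explanation requires measuring the information transferred from different input subgraphs to the predictive output, which we define as information flow. In this work, we address these key challenges and propose IFExplainer, which generates necessary and sufficient explanations for GNNs. To evaluate the information flow within a GNN's prediction, we first propose a novel notion of predictiveness, named $F$-information, which is directional and incorporates the realistic capacity of the GNN model. Based on it, IFExplainer generates the explanatory subgraph with maximal information flow to the prediction. Meanwhile, it minimizes the information flow from the input to the predictive result after the explanation is removed. The resulting explanation is therefore necessary for the prediction and sufficient to reveal the most crucial substructures. We evaluate IFExplainer by interpreting GNN predictions on breast cancer subtyping. Experimental results on the BRACS dataset show the superior performance of the proposed method.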


Contrastive Cycle Adversarial Autoencoders for Single-cell Multi-omics Alignment and Integration

Xuesong Wang, Zhihang Hu, Tingyang Yu, Ruijie Wang, Yumeng Wei, Juan Shu, Jianzhu Ma, Yu Li

Categories: Machine Learning

2021-12-05

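Multi-modality data are ubiquitous in biology, especially now that we have entered the multi-omics era, in which we can measure the same biological object (cell) from different aspects (omics) to gain a more comprehensive insight into the cellular system. When dealing with such multi-omics data, the first step is to determine the correspondence among different modalities. In other words, we should match data from different spaces that correspond to the same object. This problem is particularly challenging in the single-cell multi-omics scenario. First, such data have extremely high dimensions. Second, matched single-cell multi-omics data are rare and hard to collect. Furthermore, due to the limitations of the experimental environment, the data are usually very noisy. To promote single-cell multi-omics research, we overcome the above challenges and propose a novel framework to align and integrate single-cell RNA-seq data and single-cell ATAC-seq data. Our approach efficiently maps these data, which come from different spaces and are highly sparse and noisy, to a low-dimensional manifold in a unified space, making downstream alignment and integration straightforward. Compared with other state-of-the-art methods, our method performs better on both simulated and real single-cell data. The proposed method is helpful for single-cell multi-omics research. The improvement in integration on the simulated data is significant.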


Local Augmentation for Graph Neural Networks

Songtao Liu, Hanze Dong, Lanqing Li, Tingyang Xu, Yu Rong, Peilin Zhao, Junzhou Huang, Dinghao Wu

Categories: Machine Learning

2021-09-08

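Data augmentation has been widely used for image data and linguistic data but remains under-explored for graph neural networks (GNNs). Existing methods focus on augmenting the graph data from a global perspective and largely fall into two genres: structural manipulation, and adversarial training with feature noise injection. However, recent graph data augmentation methods ignore the importance of local information for the message passing mechanism of GNNs. In this work, we introduce local augmentation, which enhances the locality of node representations through their subgraph structures. Specifically, we model data augmentation as a feature generation process. Given a node's features, our local augmentation approach learns the conditional distribution of its neighbours' features and generates more neighbour features to boost the performance of downstream tasks. Based on local augmentation, we further design a novel framework, LA-GNN, which can be applied to any GNN model in a plug-and-play manner. Extensive experiments and analyses show that local augmentation consistently yields performance improvements for various GNN architectures across a diverse set of benchmarks.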


Tackling Over-Smoothing for General Graph Convolutional Networks

Wenbing Huang, Yu Rong, Tingyang Xu, Fuchun Sun, Junzhou Huang

Categories: Machine Learning | Machine Learning (Statistics)

2020-08-22

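Increasing the depth of a GCN, which is expected to permit more expressivity, has been shown to hurt performance, especially on node classification. The main cause lies in over-smoothing. Over-smoothing drives the output of a GCN towards a space that contains limited distinguishing information among nodes, leading to poor performance. Several works on refining the architecture of deep GCNs have been proposed, but it is still unknown in theory whether these refinements are able to relieve over-smoothing. In this paper, we first theoretically analyze how general GCNs behave as depth increases, including generic GCN, GCN with bias, ResGCN, and APPNP. We find that all these models are characterized by a universal process: all nodes converge to a cuboid. Building on this theorem, we propose DropEdge, which alleviates over-smoothing by randomly removing a certain number of edges at each training epoch. Theoretically, DropEdge either reduces the convergence speed of over-smoothing or relieves the information loss caused by dimension collapse. Experimental evaluations on a simulated dataset visualize the difference in over-smoothing between different GCNs. Moreover, extensive experiments on several real benchmarks show that DropEdge consistently improves the performance of a variety of both shallow and deep GCNs.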
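
The over-smoothing effect can be observed numerically with a small NumPy experiment. This is a minimal sketch under strong assumptions: it stacks purely linear propagation steps with the symmetrically normalized adjacency (no weights, biases, or nonlinearities), so it illustrates the phenomenon rather than the general models analyzed in the paper. The helper names `normalized_adjacency` and `row_spread` are hypothetical.

```python
import numpy as np


def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def row_spread(h: np.ndarray, adj: np.ndarray) -> float:
    """Largest per-feature gap between nodes after rescaling by sqrt(degree + 1).

    A value near zero means the nodes have become indistinguishable
    up to a degree-dependent scale.
    """
    scale = np.sqrt(adj.sum(axis=1) + 1.0)
    z = h / scale[:, None]
    return float(np.max(z.max(axis=0) - z.min(axis=0)))


# Path graph on 5 nodes with random 3-dimensional input features.
n = 5
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
h = rng.normal(size=(n, 3))
a_norm = normalized_adjacency(adj)

spreads = {}
for depth in range(1, 65):
    h = a_norm @ h  # one linear propagation step
    if depth in (1, 4, 16, 64):
        spreads[depth] = row_spread(h, adj)
```

On a connected graph the recorded spread shrinks towards zero as depth grows, which is the loss of distinguishing information between nodes that the abstract refers to.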


DropEdge: Towards Deep Graph Convolutional Networks on Node Classification

Yu Rong, Wenbing Huang, Tingyang Xu, Junzhou Huang

Categories:

2019-07-25

Over-fitting and over-smoothing are two main obstacles to developing deep Graph Convolutional Networks (GCNs) for node classification. In particular, over-fitting weakens the generalization ability on small datasets, while over-smoothing impedes model training by isolating output representations from the input features as network depth increases. This paper proposes DropEdge, a novel and flexible technique to alleviate both issues. At its core, DropEdge randomly removes a certain number of edges from the input graph at each training epoch, acting like a data augmenter and also a message passing reducer. Furthermore, we theoretically demonstrate that DropEdge either reduces the convergence speed of over-smoothing or relieves the information loss caused by it. More importantly, DropEdge is a general technique that can be combined with many other backbone models (e.g., GCN, ResGCN, GraphSAGE, and JKNet) for enhanced performance. Extensive experiments on several benchmarks verify that DropEdge consistently improves the performance of a variety of both shallow and deep GCNs. The effect of DropEdge on preventing over-smoothing is empirically visualized and validated as well. Code is released at https://github.com/DropEdge/DropEdge.
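
The core operation, randomly removing a share of the edges at every training epoch, can be sketched as follows. This is a minimal illustration and not the released implementation: the function name `drop_edge`, the `(2, num_edges)` edge-list layout, and the `drop_rate` parameter are assumptions, and the training step itself is omitted.

```python
import numpy as np


def drop_edge(edge_index: np.ndarray, drop_rate: float,
              rng: np.random.Generator) -> np.ndarray:
    """Return a copy of edge_index with a random subset of edges removed.

    edge_index has shape (2, num_edges); each column is one undirected edge,
    stored once. drop_rate is the fraction of edges to remove.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    num_edges = edge_index.shape[1]
    num_keep = num_edges - int(round(drop_rate * num_edges))
    keep = rng.choice(num_edges, size=num_keep, replace=False)
    return edge_index[:, np.sort(keep)]  # keep the original edge order


rng = np.random.default_rng(0)
edges = np.array([[0, 0, 1, 2, 3],
                  [1, 2, 2, 3, 4]])

for epoch in range(3):
    # A fresh random subgraph is drawn in every epoch.
    sampled = drop_edge(edges, drop_rate=0.4, rng=rng)
    # A GCN training step on `sampled` would go here.
```

Because a different subset is drawn in each epoch, the model sees many perturbed versions of the same graph, which matches the abstract's description of DropEdge as a data augmenter, while the sparser graph passes fewer messages per layer.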


Handling Missing Data via Max-Entropy Regularized Graph Autoencoder

Ziqi Gao, Yifan Niu, Jiashun Cheng, Jianheng Tang, Tingyang Xu, Peilin Zhao, Lanqing Li, Fugee Tsung, Jia Li

Categories: Machine Learning

2022-11-30

Graph neural networks (GNNs) are popular tools for modeling relational data. Existing GNNs are not designed for attribute-incomplete graphs, which makes missing attribute imputation a pressing issue. Recently, many works have noticed that GNNs are coupled with spectral concentration, meaning that the spectrum obtained by GNNs concentrates on a local part of the spectral domain, e.g., on low frequencies because of the over-smoothing issue. As a consequence, GNNs may be seriously flawed for reconstructing graph attributes, since graph spectral concentration tends to cause low imputation precision. In this work, we present a regularized graph autoencoder for graph attribute imputation, named MEGAE, which aims to mitigate the spectral concentration problem by maximizing the graph spectral entropy. Notably, we first present a method for estimating graph spectral entropy without the eigen-decomposition of the Laplacian matrix and provide a theoretical upper bound on the error. A maximum entropy regularization then acts in the latent space, which directly increases the graph spectral entropy. Extensive experiments show that MEGAE outperforms all the other state-of-the-art imputation methods on a variety of benchmark datasets.
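
To make spectral concentration concrete, the sketch below computes one plausible version of graph spectral entropy on a toy graph. The abstract does not give the definition, so this is an assumption: the entropy is taken as the Shannon entropy of the normalized signal energy over the eigenvectors of the combinatorial Laplacian. The sketch also uses a full eigen-decomposition, which is exactly what the authors say their estimator avoids. The function name `spectral_entropy` is hypothetical.

```python
import numpy as np


def spectral_entropy(adj: np.ndarray, x: np.ndarray) -> float:
    """Shannon entropy of a graph signal's energy distribution over frequencies.

    adj: (n, n) symmetric adjacency matrix.
    x:   (n, d) node attributes (a graph signal with d channels).
    Assumed definition: p_i is the share of energy on the i-th Laplacian
    eigenvector, and the entropy is -sum_i p_i * log(p_i).
    """
    lap = np.diag(adj.sum(axis=1)) - adj       # combinatorial Laplacian L = D - A
    _, eigvecs = np.linalg.eigh(lap)           # orthonormal frequency basis
    coeffs = eigvecs.T @ x                     # graph Fourier transform
    energy = (coeffs ** 2).sum(axis=1)
    p = energy / energy.sum()
    p = p[p > 0]                               # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())


# Path graph on 6 nodes.
n = 6
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

smooth = np.ones((n, 1))                       # constant, fully low-frequency signal
rng = np.random.default_rng(0)
noisy = rng.normal(size=(n, 1))                # energy spread over many frequencies

h_smooth = spectral_entropy(adj, smooth)
h_noisy = spectral_entropy(adj, noisy)
```

Under this definition a constant signal, whose energy sits entirely on the zero-frequency eigenvector, has entropy close to zero, while a signal with energy spread across frequencies has higher entropy, bounded above by log(n).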
